Using Git and GitHub in the Scottish Government
 

Alice Byers, Data Innovation Team

Scottish Government

Aims

  • These slides demonstrate how to use Git and GitHub in the Scottish Government when working on the SCOTS network.

  • An understanding of the concepts of version control, Git and GitHub is assumed. More information can be found in the Introduction to Version Control slides.

  • Further guidance and resources for using version control on SCOTS are available on the Statistics Group SharePoint site.

Contents

Git setup

Install Git

  • Make an iFix request for “Git version 2.21.0.windows.1”

  • Git is free and open-source software, so there is no cost to install it

Git Bash

  • Git Bash is a command line interface for using Git

  • It is installed with Git

Git Bash

  • Some basic commands include:

    • pwd: prints the present working directory

    • ls: lists the files in the working directory

    • cd <filepath>: changes the working directory

  • Git Bash cheat sheet

  • ONS Learning Hub has further training on Command Line Basics
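A short practice session with these commands might look like the sketch below. The folder and file names (my-project, code.R) are hypothetical, and a temporary folder is used so nothing on your own drive is changed.

```shell
# Create a throwaway folder to practise in (hypothetical names)
demo_dir=$(mktemp -d)
mkdir "$demo_dir/my-project"
touch "$demo_dir/my-project/code.R"

# Change the working directory to the project folder
cd "$demo_dir/my-project"

# Print the present working directory
pwd

# List the files in the working directory
ls

# Move up one level to the parent folder
cd ..
```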

Set name and email

  • Your name and email are linked to every commit you make using Git

  • If you plan to use GitHub, make sure your user name and email address match those associated with your GitHub account

# Set user name (quote the value so that names containing spaces work)
git config --global user.name "<username>"

# Set email
git config --global user.email "<email>"

# Check this has worked
git config --global user.name
git config --global user.email

Summary

  • Install Git via iFix

  • Set name and email address in Git Bash

Git workflow

Example project folder


Open Git Bash in working directory

  • Right click in project folder and select ‘Git Bash here’

  • Open Git Bash from Windows start menu

  • Use cd command to change directory to your working folder

Initialise Git repository

  • Run git init to initialise a Git repository. This will create a .git folder within your project. (Note: you only need to do this once per project.)

git init
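To confirm that the .git folder has been created, you can list hidden files. This is a minimal sketch run in a temporary folder (an assumption made so the example is safe to try anywhere).

```shell
# Work in a throwaway folder so no real project is affected
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Initialise the repository
git init

# -a includes hidden entries, so the new .git folder is listed
ls -a
```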

Create a gitignore file

  • The gitignore defines which files should not be tracked by Git

  • This is especially important if you plan on using GitHub as sensitive information should not be made available there

  • Generally, the following should be ignored:

    • Data files

    • Passwords or credentials

    • Code that contains sensitive information

    • Configuration files

Create a gitignore file

To tell Git to ignore these files:

  • Create a new file in your directory called .gitignore. This can be done in the usual way in File Explorer, or by using the touch command in Git Bash.

  • Open the gitignore file in a text editor (or R) and add names of folders and files to be ignored.

    • If you’re not sure what to include, this example gitignore contains many common data and R files. Copy the contents to the file you’ve just created.
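The steps above can also be done entirely in Git Bash. The patterns below are illustrative assumptions, not the contents of the example gitignore linked above, and secrets.txt is a hypothetical file name.

```shell
# Work in a throwaway folder so no real project is affected
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Create an empty .gitignore file
touch .gitignore

# Append patterns, one per line (lines starting with # are comments)
cat >> .gitignore <<'EOF'
# Data files
*.csv
*.xlsx
data/

# R session files
.Rhistory
.RData

# Credentials (hypothetical file name)
secrets.txt
EOF

# Check the contents
cat .gitignore
```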

git status

  • Use git status to show a summary of your Git repository - run this often to check that your other git commands have done what you expect them to do

  • Notice that data.csv is not listed here. This is because we have told Git to ignore csv files in the gitignore.

git status
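If you are unsure why a file is missing from git status, git check-ignore can report which gitignore rule matches it. A minimal sketch, assuming a throwaway repository with the same file names as the example project:

```shell
# Throwaway repository containing a script, a data file and a gitignore
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init
echo "*.csv" > .gitignore
touch code.R data.csv

# Short-format status: data.csv does not appear in the list
git status --short

# Report the gitignore file, line number and pattern that match data.csv
git check-ignore -v data.csv
```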

git add

  • Use git add to ‘stage’ files for the next commit

    • Either list the files you’d like to stage; e.g. git add code.R, or

    • To stage all new and changed files in the current folder (except those ignored by the gitignore), use a full stop; e.g. git add .

git add .

git add

  • Use git status to check that the correct files have been staged

  • File names are now coloured green and listed under ‘changes to be committed’

git status

git commit

  • Use git commit to commit the files to the Git history

git commit -m "Add files"

Your first commit

  • Running git status again shows that there are no further changes to commit

git status

Your first commit

  • Running git log --oneline will give a short summary of the commit history

git log --oneline

Make a change

  • Now, let’s make a change to code.R. Add some commented lines to give the script a title and description.

Before

After

Make a change

  • Run git status to check that Git has recognised the change

git diff

  • Run git diff to inspect what changes have been made to code.R. Green text highlights additions and red text highlights deletions.
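The sketch below reproduces this step in a throwaway repository. The file contents and the demo identity set with git config are assumptions for illustration only. It also shows git diff --staged, which is needed once a change has been staged, because plain git diff only shows unstaged changes.

```shell
# Throwaway repository with one committed file
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init
git config user.name "Example User"        # demo-only identity
git config user.email "user@example.com"   # demo-only identity
echo 'x <- 1' > code.R
git add code.R
git commit -m "Add files"

# Make a change: add a title comment at the top of the script
printf '# Title: Example analysis\nx <- 1\n' > code.R

# Show unstaged changes to code.R (added lines are prefixed with +)
git diff code.R

# Once the change is staged, use --staged to inspect it
git add code.R
git diff --staged code.R
```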

Stage and commit the change

  • Use git add and git commit to stage and commit the change to code.R.

Stage and commit the change

  • Use git log to view the Git history - there are now two commits
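Put together, the change, stage, commit and log steps might look like the sketch below. It runs in a throwaway repository, and the file contents, commit messages and demo identity are assumptions.

```shell
# Throwaway repository with a first commit
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init
git config user.name "Example User"        # demo-only identity
git config user.email "user@example.com"   # demo-only identity
echo 'x <- 1' > code.R
git add code.R
git commit -m "Add files"

# Make a change to code.R
printf '# Title: Example analysis\n# Description: Assigns a value to x\nx <- 1\n' > code.R

# Check that Git has recognised the change
git status --short

# Stage and commit the change
git add code.R
git commit -m "Add title and description to code.R"

# View the history: one line per commit
git log --oneline
```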

Tips

  • Commit often (especially when you’re still learning)

  • Write commit messages that make sense (your future self and colleagues will thank you)

  • Run git status often (especially when you’re still learning)

    • Check which files have changes tracked

    • Check you have staged the correct files

    • Check files that should be ignored are not being tracked (and committed!)

Summary

  • Make a change

  • git add

  • git commit

  • git status often

Using Git with RStudio Integration

RStudio

  • You can also interact with Git from within RStudio (instead of using Git Bash)

  • Not all Git functionality is available, but it can be more user-friendly and convenient if you’re working with R

  • Changes are listed in the Git pane (usually in the top right window)

Open the commit window

  • Click the ‘Commit’ button to open up the commit window

Committing a change from RStudio

  • Like Git Bash, you need to both stage and commit the change

  • To stage, tick the box next to each file you’d like to add (top-left pane)

  • To commit, enter a message and click ‘Commit’ (top-right pane)

GitHub and Data Science Scotland

Create GitHub account

Data Science Scotland organisation

  • Work projects should all be hosted in the Data Science Scotland organisation

    • Easier to find related repositories

    • Easier to manage who can access repositories

    • When people leave the organisation, code doesn’t leave with them!

  • There’s lots of useful information in the welcome repository

Join Data Science Scotland organisation

Create a repository

  • Select Data Science Scotland as owner for work projects

  • Give the repository a name

  • Choose whether to make your repository public or private

  • Add a README file

  • Click the green ‘Create repository’ button

Using Git with GitHub

Remote repository

  • A ‘remote’ is a version of your Git repository hosted on the internet or network somewhere

  • This should be thought of as the main place where your repository is stored

  • Most commonly, GitHub is used to host remote repositories, but a remote could also be a folder on an internal shared network.

Remote repository

  • Users take a copy (‘clone’) of the repository from the remote

  • Users regularly ‘push’ their changes back to the remote so other users have access to the latest version

  • Users regularly ‘pull’ from the remote to ensure they are working with the latest version

SSH keys

  • An SSH key is a way of identifying yourself to GitHub that means you don’t have to provide your username and password every time

  • You must use an SSH key when working with GitHub on SCOTS

  • To set up, generate an SSH key pair in Git Bash and add the public key to your GitHub account

You only need to create an SSH key once per device.

Create SSH key

  • Generate an SSH key in Git Bash

    ssh-keygen -t ed25519 -C "your_email@example.com"

    (Use the email registered with your GitHub account)

  • When asked where you want to create the key, press enter to use the default location

  • When asked if you want to set a passphrase, press enter twice to skip
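To see what ssh-keygen produces without touching your real keys, the sketch below writes a key pair to a temporary folder. The -f and -N options are used only so the example runs without prompts. When creating your real key, follow the prompts described above instead.

```shell
# Throwaway folder, so the real ~/.ssh folder is left alone
demo_dir=$(mktemp -d)

# -f sets the output file, -N "" sets an empty passphrase, -q hides progress output
ssh-keygen -t ed25519 -C "your_email@example.com" -f "$demo_dir/id_ed25519" -N "" -q

# Two files are created: the private key and the public key (.pub)
ls "$demo_dir"

# Only the public key is ever shared; it is a single line starting with ssh-ed25519
cat "$demo_dir/id_ed25519.pub"
```

The private key (the file without the .pub extension) should never be copied anywhere or shared.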

Add SSH key to GitHub account

  • Copy the SSH key:

    # Navigate to the directory where you've saved your SSH key
    cd ~/.ssh
    
    # Print the contents of the public SSH key file
    cat id_ed25519.pub

    Copy the returned value from Git Bash.

  • Add the copied SSH key to your GitHub account by going to Settings, then SSH and GPG keys

Clone repository from GitHub

  • Now you have an SSH key set up, you should use the SSH URL to clone repositories from GitHub

  • Click the green ‘Code’ button and under ‘Clone’, select ‘SSH’, and copy the address

Clone repository from GitHub

  • Open Git Bash and navigate to the directory you’d like to clone the repository to

  • Use git clone <url> to clone the repository

  • Change directory to the cloned repository using cd
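The sketch below shows the clone steps using a local bare repository as a stand-in for GitHub, so it can be tried without a network connection. The folder names are hypothetical. For a real GitHub repository, the URL would be the SSH address copied from the ‘Code’ button.

```shell
# Throwaway folder to work in
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Stand-in for the GitHub repository: an empty bare repository in a local folder
git init --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Clone the repository (Git warns that it is empty, which is expected here).
# With GitHub this would look like: git clone git@github.com:<owner>/<repository>.git
git clone "$demo_dir/remote.git" my-project

# Change directory to the cloned repository
cd my-project

# Show where the clone came from; the remote is named origin by default
git remote -v
```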

Make a change

Push to GitHub

  • ‘Push’ the commit to GitHub (the remote) using git push

  • Note that git status now says the branch is up to date with origin/main (origin is Git’s default name for the remote repository, and origin/main is its main branch)
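A sketch of committing and pushing, again using a local bare repository as a stand-in for GitHub. The file names, commit messages and demo identity are assumptions. Because the stand-in remote starts empty, the first push names the remote and branch with -u. After that, plain git push is enough, as in the slides.

```shell
# Stand-in remote and a clone of it
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main
git clone "$demo_dir/remote.git" my-project
cd my-project
git config user.name "Example User"        # demo-only identity
git config user.email "user@example.com"   # demo-only identity
git symbolic-ref HEAD refs/heads/main      # make sure the local branch is called main

# First commit and first push (-u records origin/main as the upstream branch)
echo '# My project' > README.md
git add README.md
git commit -m "Add README"
git push -u origin main

# Later commits only need a plain git push
echo 'x <- 1' > code.R
git add code.R
git commit -m "Add code.R"
git push

# The branch is now reported as up to date with origin/main
git status
```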

Push to GitHub

  • The new file and commit are now visible in the GitHub repository

Pull from GitHub

  • ‘Pull’ from GitHub regularly using git pull to ensure your local copy of the repository is up to date (especially if other people are also working on the repository)

  • Note that there is now an extra commit when running git log
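The sketch below imitates a colleague pushing a commit that you then pull. A local bare repository stands in for GitHub, and the folder names (colleague, mine), file names and identities are hypothetical.

```shell
# Stand-in remote
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# A colleague clones the repository and pushes the first commit
git clone "$demo_dir/remote.git" colleague
cd colleague
git config user.name "Example Colleague"         # demo-only identity
git config user.email "colleague@example.com"    # demo-only identity
git symbolic-ref HEAD refs/heads/main
echo '# My project' > README.md
git add README.md
git commit -m "Add README"
git push -u origin main
cd ..

# You take your own clone
git clone "$demo_dir/remote.git" mine

# The colleague pushes another commit
cd colleague
echo 'x <- 1' > code.R
git add code.R
git commit -m "Add code.R"
git push
cd ..

# Your clone is one commit behind until you pull
cd mine
git log --oneline
git pull
git log --oneline
```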

Summary

  • Use SSH keys to connect to GitHub from Git Bash

  • git clone to clone a repository from GitHub

  • Follow the Git workflow described earlier in the slides

  • git push often to push local commits to GitHub

  • git pull often to pull new commits on GitHub to local copy/clone

There’s more!

Branches

  • When you create a Git repository, you begin with a single ‘branch’, usually called main

  • The main branch is usually viewed as the ‘production-ready’ version of your code

  • You can create new branches to make developments, test new ideas and to facilitate multiple people working on the code at the same time

  • Branches allow users to make changes to files without affecting the ‘production-ready’ main branch

  • Changes made in a branch can be peer reviewed before being merged into the main branch

  • More information on branching is available in the Duck book
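A minimal sketch of working on a branch, run in a throwaway repository. The branch name (add-summary), file contents and demo identity are assumptions. The merge is done locally to keep the example short, whereas on GitHub it would normally happen through a reviewed pull request.

```shell
# Throwaway repository with one commit on main
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init
git symbolic-ref HEAD refs/heads/main      # make sure the first branch is called main
git config user.name "Example User"        # demo-only identity
git config user.email "user@example.com"   # demo-only identity
echo 'x <- 1' > code.R
git add code.R
git commit -m "Add code.R"

# Create a new branch and switch to it
git checkout -b add-summary

# Commit a change on the branch
echo 'summary(x)' >> code.R
git add code.R
git commit -m "Add summary of x"

# Switch back to main: the change is not there yet
git checkout main
cat code.R

# Merge the branch into main
git merge add-summary
git log --oneline
```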

GitHub features

  • Pull requests and code review

    • Open a pull request to merge a branch and request a code review from a collaborator

  • Issues

    • Keep track of bugs and requested enhancements for future development

  • Projects

    • A task board for planning and tracking work. Integrates with issues and pull requests.

Resources

Contact

Alice Byers

Reproducible Analytical Pipeline (RAP) Developer

Data Innovation Team, Scottish Government

GitHub profile